
Confidential Computing


Confidential Prompting: Protecting User Prompts from Cloud LLM Providers

Gim, In; Li, Caihua; Zhong, Lin

arXiv.org Artificial Intelligence

Our work tackles the challenge of securing user inputs in cloud-hosted large language model (LLM) serving while ensuring output invariance, model confidentiality, and compute efficiency. We introduce secure multi-party decoding (SMD), which leverages confidential computing to confine user prompts to a trusted execution environment (TEE), namely a confidential virtual machine (CVM), while allowing service providers to generate tokens efficiently. We also introduce a novel cryptographic method, prompt obfuscation (PO), to ensure robustness against reconstruction attacks on SMD. We demonstrate that our approach preserves both prompt confidentiality and LLM serving efficiency. Our solution can enable privacy-preserving cloud LLM serving that handles sensitive prompts, such as clinical records, financial data, and personal information.


Privacy-Preserving Decentralized AI with Confidential Computing

Lee, Dayeol; António, Jorge; Khan, Hisham

arXiv.org Artificial Intelligence

This paper addresses privacy protection in decentralized Artificial Intelligence (AI) using Confidential Computing (CC) within the Atoma Network, a decentralized AI platform designed for the Web3 domain. Decentralized AI distributes AI services among multiple entities without centralized oversight, fostering transparency and robustness. However, this structure introduces significant privacy challenges, as sensitive assets such as proprietary models and personal data may be exposed to untrusted participants. Cryptography-based privacy protection techniques such as zero-knowledge machine learning (zkML) suffer from prohibitive computational overhead. To address this limitation, we propose leveraging Confidential Computing (CC), which uses hardware-based Trusted Execution Environments (TEEs) to isolate the processing of sensitive data, ensuring that both model parameters and user data remain secure even in decentralized, potentially untrusted environments. While TEEs face a few limitations, we believe they can bridge the privacy gap in decentralized AI. We explore how TEEs can be integrated into Atoma's decentralized framework.


Confidential Computing on nVIDIA H100 GPU: A Performance Benchmark Study

Zhu, Jianwei; Yin, Hang; Deng, Peng; Zhou, Shunfan

arXiv.org Artificial Intelligence

This report evaluates the performance impact of enabling Trusted Execution Environments (TEE) on NVIDIA H100 GPUs for large language model (LLM) inference tasks. We benchmark the overhead introduced by TEE mode across various LLMs and token lengths, with a particular focus on the bottleneck caused by CPU-GPU data transfers via PCIe. Our results indicate that while there is minimal computational overhead within the GPU, the overall performance penalty is primarily attributable to data transfer. For the majority of typical LLM queries, the overhead remains below 5%, with larger models and longer sequences experiencing nearly zero overhead.
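The reported pattern, near-zero overhead for compute-heavy queries and a larger penalty when PCIe transfers dominate, can be illustrated with a toy cost model. The function name, parameters, and the encrypted-transfer slowdown factor below are illustrative assumptions, not figures from the benchmark report:

```python
def tee_overhead(compute_s, transfer_s, transfer_slowdown=1.3):
    """Toy model: TEE mode leaves GPU compute time unchanged but slows
    CPU-GPU PCIe transfers by `transfer_slowdown` (e.g., encrypted
    bounce buffers). Returns the relative end-to-end overhead."""
    baseline = compute_s + transfer_s
    with_tee = compute_s + transfer_s * transfer_slowdown
    return with_tee / baseline - 1.0

# Compute-dominated query (large model, long sequence): tiny overhead.
print(tee_overhead(compute_s=10.0, transfer_s=0.5))  # ~0.014 (1.4%)
# Transfer-dominated query (small model, short prompt): larger overhead.
print(tee_overhead(compute_s=1.0, transfer_s=1.0))   # 0.15 (15%)
```

The model captures why longer sequences amortize the fixed transfer cost: as `compute_s` grows relative to `transfer_s`, the ratio approaches zero.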


Trustworthy AI Using Confidential Federated Learning

Communications of the ACM

The artificial intelligence (AI) revolution is reshaping industries and transforming the way we live, work, and interact with technology. From AI chatbots and personalized recommendation systems to autonomous vehicles navigating city streets, AI-powered innovations are emerging everywhere. As businesses and organizations harness AI to streamline operations, optimize processes, and drive innovation, the potential for economic growth and societal advancement is immense. Amid this rapid progress, however, it is critical to ensure AI's trustworthiness. Trustworthy AI systems must exhibit certain characteristics, such as reliability, fairness, transparency, accountability, and robustness. Only then can AI systems be depended upon to operate ethically and effectively without causing harm or discrimination.


Creating the First Confidential GPUs

Communications of the ACM

With these considerations in mind, users can proceed to use the H100 GPU in CC mode. A primary goal of delivering CC to customers is that CUDA applications can run unchanged while maximizing the acceleration potential of the underlying hardware and software. CUDA provides lift-and-shift benefits to applications that will be run in CC mode. As a result, the NVIDIA GPU CC architecture is compatible with the CPU architectures that also provide application portability from nonconfidential to CC environments. Given the description so far, it should not be surprising that CC workloads on the GPU perform close to non-CC mode when the amount of compute is large compared with the amount of input data. When the amount of compute is low compared with the input data, the overhead of communicating across the nonsecure interconnect limits the application throughput.
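The compute-versus-data rule of thumb above can be sketched as a simple regime test. The throughput numbers in the usage lines are hypothetical placeholders, not H100 specifications:

```python
def interconnect_bound(flops, bytes_moved, gpu_flops_per_s, link_bytes_per_s):
    """True when time spent on the nonsecure interconnect exceeds GPU
    compute time, i.e. the regime where CC-mode overhead limits
    application throughput."""
    return bytes_moved / link_bytes_per_s > flops / gpu_flops_per_s

# Hypothetical device: 1e14 FLOP/s of compute, 6.4e10 B/s over the link.
print(interconnect_bound(1e12, 1e8, 1e14, 6.4e10))  # False: compute dominates
print(interconnect_bound(1e9,  1e8, 1e14, 6.4e10))  # True: transfer dominates
```

In the first case the kernel does enough work per byte moved that encrypted-transfer costs hide behind compute; in the second, the secure-channel crossing sets the ceiling.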


Machine Learning with Confidential Computing: A Systematization of Knowledge

Mo, Fan; Tarkhani, Zahra; Haddadi, Hamed

arXiv.org Artificial Intelligence

Privacy and security challenges in Machine Learning (ML) have become increasingly severe, along with ML's pervasive development and the recent demonstration of large attack surfaces. As a mature, system-oriented approach, Confidential Computing has been utilized in both academia and industry to mitigate privacy and security issues in various ML scenarios. In this paper, the conjunction between ML and Confidential Computing is investigated. We systematize the prior work on Confidential Computing-assisted ML techniques that provide i) confidentiality guarantees and ii) integrity assurances, and discuss their advanced features and drawbacks. Key challenges are further identified, and we provide dedicated analyses of the limitations of existing Trusted Execution Environment (TEE) systems for ML use cases. Finally, prospective work is discussed, including grounded privacy definitions for closed-loop protection, partitioned execution for efficient ML, dedicated TEE-assisted designs for ML, TEE-aware ML, and full-pipeline ML guarantees. By providing these potential solutions in our systematization of knowledge, we aim to build a bridge toward much stronger TEE-enabled ML that offers privacy guarantees without introducing prohibitive computation and system costs.


Business Services Becoming More Reliant on Artificial Intelligence as AI Market Value Exceeds $130 Billion

#artificialintelligence

Artificial Intelligence (AI) has become ubiquitous in the past several years. There is scarcely a part of our businesses, cultures, governments, or consumer markets that it has not touched. The continuous research and innovation directed by tech giants is driving the adoption of advanced technologies in industry verticals such as automotive, healthcare, retail, finance, manufacturing, staffing, and education. Technology has always been an essential element for these industries, but artificial intelligence has brought technology to the center of organizations. For instance, from self-driving vehicles to crucial life-saving medical gear, AI is being infused into virtually every apparatus and program.


New Ideas in Distributed and Cluster Computing, Part 2

#artificialintelligence

Abstract: Federated clustering (FedC) is an adaptation of centralized clustering to federated settings, which aims to cluster data according to a global similarity measure while keeping all data local. Two of the main challenges of FedC are the non-identically and independently distributed (non-i.i.d.) nature of data across different sources and the need for privacy protection. In this paper, we propose a differentially private federated clustering (DP-FedC) algorithm to deal with these challenges. Unlike most existing algorithms, which do not consider privacy, the proposed DP-FedC algorithm is designed to handle non-convex and non-smooth problems, using differential privacy techniques to guarantee privacy together with a privacy-amplification-assisted tradeoff between learning performance and privacy protection. Theoretical analyses of the performance and privacy of the proposed DP-FedC are then presented, showing the impact of privacy protection, data heterogeneity, and partial client participation on learning performance.
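The core differential-privacy ingredient of such an algorithm, clipping each client's local contribution and adding calibrated Gaussian noise before it leaves the client, can be sketched as follows. The clip bound, noise scale, and function name are illustrative assumptions, not the paper's actual DP-FedC construction:

```python
import random

def dp_local_sum(points, clip=1.0, sigma=0.5, rng=None):
    """Clip each local point to L2 norm `clip`, sum the clipped points,
    and add Gaussian noise scaled to the clipping bound, so the shared
    aggregate has bounded sensitivity to any one point."""
    rng = rng or random.Random(0)
    dim = len(points[0])
    total = [0.0] * dim
    for p in points:
        norm = sum(x * x for x in p) ** 0.5
        scale = min(1.0, clip / norm) if norm > 0 else 1.0
        for i, x in enumerate(p):
            total[i] += x * scale
    return [t + rng.gauss(0.0, sigma * clip) for t in total]
```

With `sigma=0.0` the function reduces to plain clipped aggregation, which makes the clipping step easy to check: a point `[3, 4]` (norm 5) clipped to norm 1 becomes `[0.6, 0.8]`.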


Confidential computing provides revolutionary data encryption, UC Berkeley professor says

#artificialintelligence

Confidential computing is a potentially revolutionary technology in terms of its impact on data security. In confidential computing, data remains encrypted not just at rest and in transit, but also in use, allowing analytics and machine learning (ML) to be performed on the data while maintaining its confidentiality. The capability to encrypt data in use opens up a massive range of possible real-world scenarios, and it has major implications and potential benefits for the future of data security. VentureBeat spoke with UC Berkeley professor Raluca Ada Popa about her research and work in developing practical solutions for confidential computing.


Exploration on Confidential Computing for Big Data & AI using BigDL

#artificialintelligence

Intel Software Guard Extensions (Intel SGX) is a secure computing technology that creates a trusted execution environment (TEE) for users who need secure and confidential environments for use cases such as private key management, multi-party computing with private data, and securing public cloud deployments for critical applications. While the Intel SGX SDK for Linux* OS successfully tackles these important use cases, its implementation is not simple. It can require significant system redesign and code changes by engineers, because under the SGX SDK's threat model the OS is not trusted, and only trusted application code can run in the secure region carved out by SGX, i.e., an "enclave." The trusted and untrusted components of an application therefore need to be separated, and engineers must re-engineer parts of their code base so that the trusted portion can run inside this enclave.
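The enclave partitioning described above, a narrow, explicit call interface between untrusted host code and trusted code, can be caricatured in a few lines. This is a conceptual sketch in Python, not the SGX SDK's actual C interface; all names here are invented:

```python
# Registry of "trusted" entry points; in real SGX these are ECALLs
# declared in an EDL file and compiled into the enclave.
TRUSTED = {}

def ecall(fn):
    """Mark fn as one of the only entry points into the 'enclave'."""
    TRUSTED[fn.__name__] = fn
    return fn

@ecall
def sign(message: bytes) -> int:
    # The secret never leaves the trusted side; host code only sees
    # the result. (Toy keyed checksum, not real cryptography.)
    _SECRET = 0x5EC12E7
    return (sum(message) * _SECRET) % (1 << 32)

def host_call(name, *args):
    """Untrusted host code may only cross into trusted code through
    registered entry points; anything else is rejected."""
    if name not in TRUSTED:
        raise PermissionError(f"{name} is not a registered ECALL")
    return TRUSTED[name](*args)
```

The re-engineering burden the article describes corresponds to deciding which functions get the `@ecall` marker: everything behind it must be self-contained and trusted, everything outside must work with results only.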